Drug resistance to HIV-1 protease inhibitors involves the accumulation of multiple mutations in the protein.
Here we investigate the role of these mutations by using molecular dynamics simulations which
exploit the influence of the native-state topology on the folding process. Our calculations show that
sites contributing to phenotypic resistance to FDA-approved drugs are among the most sensitive
positions for the stability of partially folded states and should play a relevant role in the folding
process. Furthermore, associations between amino acid sites mutating under drug treatment are shown
to be statistically correlated. The striking correlation between clinical data and our calculations
suggests a novel approach to the design of drugs tailored to bind regions crucial not only for protein
function but also for folding.



I. INTRODUCTION 



The human immunodeficiency virus encodes a protease (HIV-1 PR) that cleaves the gag and gag-pol viral
polyproteins into enzymes and structural proteins [1]. The discovery that inhibition of this protease (a
homodimer of 198 amino acids) causes the formation of non-infectious virus particles has prompted an enormous
effort to design efficient inhibitors against AIDS [2]. Currently, five antiviral agents are approved by the
FDA: Saquinavir (SQV), Ritonavir (RTN), Indinavir (IND), Nelfinavir (NLF) and Amprenavir (APR); several
others are in clinical trials. Unfortunately, the therapeutic benefit is short-lived, as the virus strains
evolving under the drugs' selective pressure encode multiple HIV-1 PR mutants with low drug affinity: mutants
resistant to protease inhibitors can emerge in vivo in less than one year. The problem of drug resistance
persists even when a combination of PR and reverse transcriptase (RT) inhibitors is used.

The occurrence of mutations that withstand antiviral drugs is not a mere consequence of drug action; rather,
it results from viral replication itself [5]. Indeed, mutations are found largely irrespective of the
structural diversity of the drugs and involve virtually every protein domain: this is the case for the HIV-1 PR
mutants that are resistant to FDA-approved drugs. Among such mutations, a few involve the active site
(residue 25, aspartic acid), while the others belong to protein regions away from it.

By means of molecular dynamics (MD) simulations within the framework introduced in Ref. [9] (see next section),
we show here that the positions where these mutations occur play a key role near the temperatures at which
the specific heat peaks. Moreover, we shall argue that residues involved in frequently observed covariant
mutations are statistically correlated. Indeed, while the study of the native-state structure can lead to a
rational design of drugs that bind the active site (or otherwise disrupt the biological function of the agent by
acting on its native structure), the analysis of the folding pathways can provide fundamental information [10].
In particular, it can reveal, as in the case of HIV-1 PR, the presence of kinetic bottlenecks associated with a
severe entropy reduction that inhibits the progress toward the native state. These bottlenecks represent the
most delicate part of the folding process. They are followed by the sudden formation of specific native-like
protein sub-regions, after which folding proceeds rapidly until another bottleneck is reached. The
identification of the sites involved in the bottlenecks, and of their correlation with the active site, is
crucial pharmaceutically because such sites are ideal targets for effective drugs. From this point of view,
owing to the large amount of data available on drug resistance, HIV-1 PR is an excellent candidate for
validating our automatic strategy for identifying key folding sites. In the following, we present evidence
that the crucial sites can be identified with good statistical confidence. The framework introduced here is
general and, applied to other viral proteins, ought to be useful for suggesting which sites should be
preferentially targeted by effective drugs.

II. THEORY



The strategy adopted here to identify the crucial sites for the folding and assembly of HIV-1 PR is based on a
recent theoretical framework [9,11] that captures the main features of the folding process through a simplified
description of both the protein structure and the folding dynamics. The method rests on the observation
that the topology of the native state plays a crucial role in steering the folding process [9,11-17]. This
statement is supported by an increasing amount of experimental evidence. Perhaps the most notable examples
are: (a) the close similarity of the transition-state conformations of proteins having structurally related
native states (despite very poor sequence similarity) [18,19], and (b) the strong influence that certain simple
topological properties, such as the contact order, have on protein folding rates [20]. Such observations, and
others summarised in the recent review by D. Baker [21], complement the findings of Anfinsen, who established
that a protein's amino acid sequence uniquely encodes its native state [22]. Indeed, since the topology of the
native state influences the folding process, the amino acid sequence must also encode its possible folding
pathways.

We focus our attention on the topological rate-limiting steps along the pathways from the unfolded states to
the native one. Such bottleneck stages are usually associated with non-local amino acid interactions whose
formation requires overcoming a large entropy barrier (due to the flexibility of the peptide chain intervening
between the two residues); the formation of such crucial contacts acts as a nucleus for the establishment of
further native interactions and leads to rapid progress along the folding reaction coordinate until another
barrier is met.

It is striking that the sites involved in the topological bottlenecks are those where the largest changes in the
folding kinetics are observed in site-directed mutagenesis experiments, as first established for CI2 and
Barnase [10]. This shows that nature has carefully optimised the protein sequence so as to exploit the
conformational entropy reduction accompanying the folding process [23] through the careful choice of the amino
acids forming the crucial contacts.

With the purpose of identifying the key sites, we investigate the topological obstacles encountered during the
formation of the native HIV-1 PR structure. Such sites are, intuitively, the ideal candidate targets for
effective drugs, as they take part in the most delicate steps of the folding process. This fact was first
recognised by Anfinsen in connection with staphylococcal nuclease [22]. The most effective strategy to prevent
the formation of the protein is to act on the residues involved in the key contacts, thereby undermining the
overcoming of the bottleneck stages. One of the distinctive features of HIV is its extremely high mutation rate.
The capability of encoding several mutants gives HIV-1 PR a way to elude the disruptive action of the drug by
modifying the key sites. This seems an unavoidable countermeasure, since the viable mutants (i.e. those with
native-like enzymatic activity) retain the original native structure and hence, arguably, encounter the same
bottlenecks as the wild type. Within this framework, the lapse of time during which drug therapy is temporarily
effective corresponds to the time taken by the virus to encode, through random mutations, a mutant form of
HIV-1 PR in which the crucial sites have been fine-tuned to overcome not only the kinetic bottlenecks (as for
the wild type) but also the additional drug attack. The key sites identified through the method explained in
the next sections have been compared with the known key mutating positions of HIV-1 PR, and a highly
significant correlation between the two sets was found. In addition, previously unexplained covariant mutations
seen in HIV-1 PR are explained as arising from the correlation between distinct topological bottlenecks.



III. METHODS 

The model that we adopted encompasses an energy-scoring function of the Go type [24]. This is one of the simplest
energy functionals and provides a natural topological bias towards the native state by rewarding the formation
of native pairwise interactions. In the version used here, which is a generalization of Ref. [25] suited to
molecular dynamics studies, the cooperativity of the folding process is enhanced by the introduction of
repulsive non-native interactions. In our Hamiltonian, each pair of non-consecutive amino acids, $i$ and $j$,
at distance $r_{ij}$ interacts through the potential

$$V_{ij} = V_0\,\Delta^{N}_{ij}\left[5\left(\frac{r^{N}_{ij}}{r_{ij}}\right)^{12} - 6\left(\frac{r^{N}_{ij}}{r_{ij}}\right)^{10}\right] + V_1\,(1-\Delta^{N}_{ij})\left(\frac{r_0}{r_{ij}}\right)^{12}, \qquad (1)$$

where $r_0 = 6.8$ Å, $r^{N}_{ij}$ denotes the distance between the C$_\alpha$ atoms of amino acids $i$ and $j$ in
the native structure, and $\Delta^{N}_{ij}$ is the native contact map, whose entries are 1 (0) if $i$ and $j$
are (not) in contact in the native conformation (i.e. their distance is below (above) 6.5 Å). $V_0$ and $V_1$
are constants controlling the strength of the interactions ($V_0 = 20$, $V_1 = V_0/400$ in our simulations).
In addition, the peptide bond between two consecutive amino acids, $i$ and $i+1$, at distance $r_{i,i+1}$ is
described by the anharmonic potential:

$$V_{P}(r_{i,i+1}) = \frac{a}{2}\,(r_{i,i+1} - r_d)^2 + \frac{b}{4}\,(r_{i,i+1} - r_d)^4, \qquad (2)$$

with parameters $a = V_0$ and $b = 10\,V_0$; $r_d = 3.8$ Å is the rest distance between consecutive C$_\alpha$ atoms.
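
As an illustration, the two interaction terms can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code used for the simulations: the 12-10 form of the native attraction is an assumption (a common choice for Go-type C-alpha models), the quadratic-plus-quartic bond term uses the prefactors a/2 and b/4, and all function and constant names are hypothetical.

```python
# Parameter values quoted in the text (reduced energy units, lengths in angstrom).
V0 = 20.0            # strength of native contacts
V1 = V0 / 400.0      # strength of the non-native repulsion
R0 = 6.8             # length scale of the non-native repulsion
R_BOND = 3.8         # rest distance between consecutive C-alpha atoms
A_BOND = V0          # quadratic bond coefficient a
B_BOND = 10.0 * V0   # quartic bond coefficient b


def pair_energy(r, r_native, in_contact):
    """Energy of a non-consecutive residue pair at distance r.

    Assumed form: 12-10 attraction with minimum -V0 at r = r_native for
    native contacts, purely repulsive (R0 / r)**12 term otherwise.
    """
    if in_contact:
        x = r_native / r
        return V0 * (5.0 * x**12 - 6.0 * x**10)
    return V1 * (R0 / r) ** 12


def bond_energy(r):
    """Anharmonic energy of the bond between residues i and i+1."""
    d = r - R_BOND
    return 0.5 * A_BOND * d**2 + 0.25 * B_BOND * d**4
```

With this choice every native contact has the same well depth, so the only structural input is the list of native distances and the contact map.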

It is important to notice that in Eq. (1) the formation of any native contact is rewarded in the same way, since
$V_0$ does not depend on $i$ and $j$. This choice is deliberate, so that the only information entering Eq. (1)
is the native contact map and not the types of the interacting amino acids (i.e. no sequence information).
While, by construction, the minimum of the energy-scoring function is attained at the native state, there is no
a priori guarantee that the folding process under the influence of the pair interaction (1) occurs, on average,
through the same stages encountered in nature, or even in a more sophisticated atomic MD simulation with ab
initio force fields. Certainly, there are situations where the influence of the native-state topology on the
folding process may be overridden by strong chemical propensities to form definite pairs of amino acids (such as
disulfide bridges). In addition, given the explicit bias towards the native state, one should not expect to
observe intermediate states with a low concentration of native contacts. Aside from such circumstances, it is
appropriate to ask whether one can reproduce the key steps of the folding process by exploiting only the
structural information of the native state. The basis and justification for the present study is the growing
evidence that this question has a positive answer. In fact, starting from the work of Ref. [9] and later of
Refs. [11-17], it has become clear that the transition states of a variety of proteins can be confidently
characterized within a Go-model scheme.

In the present study, the starting structural model (target) is the free enzyme, which is a homodimer with each
subunit composed of 99 residues, see Fig. 1. Following Ref. [25], the crystallographic C2 symmetry was enforced
during the MD simulations to reduce the computational effort. The progress towards the fully folded (native)
state was estimated in terms of the fraction of native contacts formed at any given time in the partially
folded structure, $\Gamma$. This quantity, also termed overlap, is defined as

$$Q = \frac{\sum_{i<j} \Delta^{N}_{ij}\,\Delta^{\Gamma}_{ij}}{\sum_{i<j} \Delta^{N}_{ij}}, \qquad (3)$$

where $\Delta^{N}$ is the native contact map and $\Delta^{\Gamma}$ is the contact matrix of $\Gamma$.
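
The overlap can be computed directly from C-alpha coordinates. The following is a minimal sketch, assuming that coordinates are given as a list of (x, y, z) tuples in angstrom and that two non-consecutive residues are in contact when they are closer than 6.5 Å; the function names are hypothetical.

```python
import math


def contact_map(coords, cutoff=6.5):
    """Return the set of pairs (i, j), with j > i + 1, closer than cutoff."""
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + 2, n):  # consecutive residues are excluded
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts


def overlap(native_contacts, coords, cutoff=6.5):
    """Fraction Q of the native contacts that are formed in coords."""
    if not native_contacts:
        raise ValueError("native contact map is empty")
    formed = sum(
        1
        for i, j in native_contacts
        if math.dist(coords[i], coords[j]) < cutoff
    )
    return formed / len(native_contacts)
```

By construction the overlap of the native structure with itself is 1, and it drops to 0 for a conformation in which none of the native pairs lies within the cutoff.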

Constant-temperature MD simulations were carried out at several decreasing temperatures, taking the protein
from the unfolded to the folded state. The equations of motion for the C$_\alpha$ atoms were integrated with a
velocity-Verlet algorithm with time step $\Delta t = 0.01$, combined with the standard Gaussian isokinetic
scheme [27]. We performed unfolding simulations within the same framework by starting from the native structure
and taking it through a sequence of increasing temperatures (heat denaturation). The temperature was measured
in reduced units of $V_0/k_B$ ($k_B$ being the Boltzmann constant and $V_0$ the energy of the native contacts in
Eq. (1)). At each temperature, we let the system equilibrate starting from the last structure reached in the
previous run at a nearby temperature. Each equilibration involved $5 \cdot 10^5$ MD steps, a time much longer
than the largest correlation times observed for the system. After equilibration, we sampled 4000 structures at
time intervals twice the estimated correlation time. At each temperature, we collected the energy histogram of
these uncorrelated structures. Using the multiple-histogram technique [28], the energy measurements for all
temperatures were reweighted to provide optimal estimates of thermodynamic quantities, such as the average
energy and the specific heat (by differentiation of the former), over a continuous range of temperatures. The
statistical significance of the data collected in our runs was checked by verifying that the reweighted
thermodynamic quantities did not change by more than a few percent upon addition of energy histograms
obtained from folding/unfolding simulations with different temperature schedules or initial conditions.
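
To make the reweighting idea concrete, the sketch below uses the simpler single-histogram variant, in which energies sampled at one temperature are reweighted to a nearby one. It is not the multiple-histogram procedure used in this work, which combines runs at all temperatures. It assumes canonically distributed energy samples and units with $k_B = 1$, and it obtains the specific heat from the energy fluctuations, which for a canonical ensemble equals the temperature derivative of the average energy. Function names are hypothetical.

```python
import math


def reweighted_moments(energies, t_sim, t_new):
    """Estimate <E> and <E^2> at t_new from samples drawn at t_sim (k_B = 1)."""
    d_beta = 1.0 / t_new - 1.0 / t_sim
    exponents = [-d_beta * e for e in energies]
    shift = max(exponents)  # subtract the maximum to avoid overflow
    weights = [math.exp(a - shift) for a in exponents]
    norm = sum(weights)
    mean_e = sum(w * e for w, e in zip(weights, energies)) / norm
    mean_e2 = sum(w * e * e for w, e in zip(weights, energies)) / norm
    return mean_e, mean_e2


def specific_heat(energies, t_sim, t_new):
    """Specific heat at t_new from the reweighted energy fluctuations."""
    mean_e, mean_e2 = reweighted_moments(energies, t_sim, t_new)
    return (mean_e2 - mean_e**2) / t_new**2
```

Single-histogram estimates are only reliable close to the sampled temperature, which is the limitation that combining histograms from several temperatures is meant to remove.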

Within the approximation in which the reaction coordinate is the internal energy, the slowest dynamics on
approaching from the high-temperature side occurs at temperatures near the specific heat peak, with a
relaxation time at least of order $D \cdot T C_V$, where $D$ is a suitable coefficient with dimensions of
seconds/joule. Thus the contacts contributing most to the specific heat peak are identified as the key ones
belonging to the folding bottleneck, and the sites sharing them as the most likely to be sensitive to
mutations. Furthermore, by following several individual folding trajectories (obtained by suddenly quenching
unfolded conformations below the folding temperature, $T_{\rm fold}$), we ascertained that all such dynamical
pathways encountered the kinetic bottlenecks described in the next section.

A reliable and convenient way to identify and characterize the kinetic bottlenecks is to locate the peaks
and shoulders of the specific heat (which signal the overcoming of free energy barriers). Moreover, since the
specific heat results from the contributions of all pairs of native contacts, it is also useful to monitor the
formation of each native interaction throughout the folding process. Indeed, the probability of formation of a
native contact is a decreasing function of $T$ and has a sigmoidal shape that is well fitted by a suitably
shifted hyperbolic tangent (see Fig. 2). The smooth interpolation allows one to identify a crossover
temperature, $T_0$, where the slope reaches its maximum, $C_0$ (that is, the inflection point of the curves in
Fig. 2). $T_0$ defines a local "transition" temperature at which each contact is locked, whereas $C_0$, which
represents the "rapidity" of its formation, can also be regarded as a measure of the local contribution to the
specific heat.
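
The extraction of the crossover temperature and of the rapidity from an occupation curve can be sketched as follows. The specific parametrization, p(T) = [1 - tanh((T - T0)/w)]/2, and the brute-force least-squares grid search are assumptions made for illustration; the fitting procedure actually used is not specified beyond the shifted hyperbolic tangent. With this form the slope at the inflection point has magnitude 1/(2w). Function names are hypothetical.

```python
import math


def occupation_model(t, t0, width):
    """Sigmoid decreasing from 1 (low T) to 0 (high T), centred at t0."""
    return 0.5 * (1.0 - math.tanh((t - t0) / width))


def fit_contact(temps, occupations, n_grid=100):
    """Least-squares grid search over (t0, width); returns (t0, c0)."""
    t_min, t_max = min(temps), max(temps)
    span = t_max - t_min
    best = None
    for a in range(n_grid + 1):
        t0 = t_min + span * a / n_grid
        for b in range(1, n_grid + 1):
            width = span * b / n_grid
            err = sum(
                (occupation_model(t, t0, width) - p) ** 2
                for t, p in zip(temps, occupations)
            )
            if best is None or err < best[0]:
                best = (err, t0, width)
    _, t0, width = best
    c0 = 1.0 / (2.0 * width)  # magnitude of the slope at the inflection point
    return t0, c0
```

A narrow sigmoid (small width) therefore corresponds to a large rapidity, i.e. to a contact that forms abruptly over a small temperature interval.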

IV. RESULTS 

At very low temperatures, the observed structures have nearly 100% native-state similarity, measured as the
fraction $Q$ of established native contacts (Eq. (3)). A further increase in temperature causes structural
rearrangements into a configuration that can no longer be assembled into a dimer (Fig. 3): the number of
subunit-subunit contacts vanishes and the two subunits behave independently. The dissociation mechanism is well
described by Fig. 4, where we report, for several temperatures, the fractional occupation of native contacts
within the individual subunits and at the monomer-monomer interface. The dissociation is also signalled by an
abrupt increase of the specific heat of the dimer, see inset of Fig. 5 (this defines the dissociation
temperature $T_{\rm diss}$). A typical structure at this temperature is shown in Fig. 3a. At even higher
temperatures ($T = 1.4\,T_{\rm diss}$), a large increase of the specific heat is observed, indicating the
presence of a strong transition of the single subunits [29], see Fig. 5. This temperature is identified with the
folding temperature, $T_{\rm fold}$. Consistently with other studies on different proteins, the native overlap at
$T_{\rm fold}$ was about 50%. A typical structure at this temperature is shown in Fig. 3b.

A further set of bottlenecks is encountered at $T \approx 1.4\,T_{\rm fold}$, where the formation of the three
$\beta$-strands of HIV-1 PR is involved. Upon increasing the temperature one encounters $\beta$-sheet
$\beta_2$, then $\beta_1$, then $\beta_3$. It is found that the kinetic bottleneck for the formation of a
generic $\beta$-sheet is not the establishment of the contact(s) closest to the $\beta$-turn (which involve
amino acids near in sequence) but is located further away. A quantitative analysis of the amino acids most
involved in the folding bottleneck is again obtained by monitoring, during the folding/unfolding process, each
pair of amino acids that are in contact in the native state. Examples of the probability with which individual
contacts are formed are shown in Fig. 2.

At each temperature at which the dynamical evolution of HIV-1 PR is followed, the formation probability of each
native contact (fractional occupation) is calculated. Such quantities are 1 at very low temperatures (all native
contacts always present) and decrease to zero at temperatures higher than the folding temperature. It may be
anticipated that the rate of decrease as a function of temperature will not be the same for all contacting
pairs. In particular, trivial local contacts between residues with a small sequence separation will have a large
probability of being formed even at high temperatures. Our interest focuses on those contacts that show a
dramatic increase of the fractional occupation near the folding transition. Those are the key contacts
responsible for the appearance of the specific heat peak. Examples of the fractional occupation for three native
contacts are shown in Fig. 2. Given the monotonic behaviour of the fractional occupation, one may concisely
characterize the formation of each contacting pair by the temperature at which the inflection point of the curve
occurs and by the slope at that same point. Both data can be conveniently summarised in two scatter plots where
the slope, $C_0$, and the temperature of formation, $T_0$, are reported for each residue taking part in native
contacts. Such graphs are reported in Fig. 6. Notice that, for each site, there are as many entries as the
number of contacts involving it (a number that typically differs from site to site). Figure 6 clearly shows that
there are clusters of contacts that are turned on at similar temperatures.

The bottlenecks for the folding process are identified by isolating the contacts having both a formation
temperature, $T_0$, matching the location of the peaks and shoulders of the specific heat, and a high rapidity of
formation, $C_0$. Figure 6 reveals the presence of four distinct clusters of contacts. The first three, labelled
$\beta_1$, $\beta_2$, $\beta_3$, are associated with the formation of the three antiparallel $\beta$-sheets in
HIV-1 PR. Their temperature of formation is about $1.4\,T_{\rm fold}$, and corresponds to the shoulder visible in
the larger plot of the specific heat in Fig. 5. The sites sharing the most important contacts involved in these
three bottlenecks are listed in Table I and highlighted in Fig. 3c, where a typical structure at
$T = 1.4\,T_{\rm fold}$ is shown. It is interesting to see that $T_0$ is maximum for sites close to the
$\beta$-turn, in accord with the intuitive expectation that the $\beta$-sheet formation is initiated at the turn.
In Fig. 7a, b and c, we have reported the values of $C_0$ only for the pairs of contacting sites in the
$\beta$-sheets. It is seen that the sites closest to the turn have a small formation rapidity. This can be
understood since, being very close along the sequence, the corresponding contacts can be easily formed and
broken. The highest rapidity, $C_0$, i.e. the highest difficulty of formation, is typically encountered 3-4
sites away from the turn. The corresponding contacts are then identified as the bottleneck for this particular
folding stage. For the $\beta$-sheets, therefore, the bottlenecks involve amino acids that are typically 3-4
residues away from the turns.

Going back to Fig. 6, one observes that there is a fourth group of contacts around residues 30 and 86, labelled
TB after "transition bottleneck", that are formed cooperatively at the folding transition. The sites involved in
the TB contacts are listed in Table I. Among those contacts we have recorded the largest values of $C_0$, as
shown more clearly in Fig. 7d. Again, we consider the sites with the highest values of $C_0$ as responsible for
the main bottleneck of the folding process. The highest "rapidity" is measured for contacts 29-86 and 30-84 (see
also Fig. 2), which are, consequently, identified as the most crucial for the folding/unfolding process.
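
The selection rule described in this section (a formation temperature close to a feature of the specific heat, combined with a high rapidity) can be sketched as a simple filter. This is an illustrative sketch only: the tolerance window, the tuple layout (i, j, T0, C0) and the function name are assumptions, and the default of three pairs follows the convention of Table I.

```python
def key_sites(contacts, t_feature, tol, n_pairs=3):
    """Sites of the n_pairs fastest-forming contacts near a specific-heat feature.

    contacts: iterable of (i, j, t0, c0) tuples, one per native contact.
    t_feature: temperature of the peak or shoulder of the specific heat.
    tol: half-width of the temperature window around t_feature.
    """
    near = [c for c in contacts if abs(c[2] - t_feature) <= tol]
    near.sort(key=lambda c: c[3], reverse=True)  # highest rapidity first
    sites = set()
    for i, j, _t0, _c0 in near[:n_pairs]:
        sites.update((i, j))
    return sorted(sites)
```

Since the three $\beta$-sheet clusters form at similar temperatures, a temperature window alone would not separate them; in practice one would additionally restrict the contacts to the residue range of each sheet.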



V. DISCUSSION 



The sites involved in the main folding bottleneck (TB) are located at the active site of HIV-1 PR, which is
targeted by anti-AIDS drugs. Hence, within the limitations of our simplified approach, we predict that changes in
the detailed chemistry at the active site also affect a key step of the folding process. To counteract the drug
action, the virus has to perform some very delicate mutations at the key sites; within a random mutation scheme
this requires many trials (occurring over several months). The time required for the biosynthesis of a mutant
with native-like activity is even longer if the drug attack correlates with several bottlenecks simultaneously.

This is certainly the case for several anti-AIDS drugs. Indeed, Table II summarises the mutations that have
emerged for the FDA-approved drugs. Remarkably, among the 23 most crucial sites predicted by our method and
listed in Table I, there are 7 sites in common with the 16 distinct mutating sites of Table II. The probability
that two sets of 16 and 23 sites randomly taken from a total population of 99 (the length of the HIV-1 PR
monomer) share at least 7 sites is only 3%. Also note that all the mutation sites of Table II except 82, 35, 36
and 90 fall within at most one position of the sites of Table I. These results highlight the high statistical
correlation between our predictions and the evidence accumulated from clinical trials.
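
The chance-overlap probability quoted above follows a hypergeometric distribution: one set is held fixed and the other is drawn at random without replacement from the 99 positions. A minimal sketch that evaluates it with exact integer binomials is given below; the function name is hypothetical.

```python
from math import comb


def prob_shared_at_least(n_total, size_a, size_b, k_min):
    """P(|A and B| >= k_min) for random subsets A, B of n_total sites."""
    denom = comb(n_total, size_b)
    k_max = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(n_total - size_a, size_b - k)
        for k in range(k_min, k_max + 1)
    )
    return favourable / denom
```

Calling `prob_shared_at_least(99, 23, 16, 7)` gives a value of a few percent, so an overlap of this size is unlikely to arise by chance.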

All mutations causing resistance involve residues that are crucial for the main folding bottleneck (particularly
residue 84), in combination with key sites for one or more of the $\beta$-sheets. Mutations at these sites are
expected to modify the energetics and structure of the partially folded states. In contrast, the folded state
appears to be weakly affected by specific mutations, such as M46I, L63P, V82T and I84V, which lead to a
C$_\alpha$ RMS distance of 0.5 Å from the wild type [35,36]. In the light of these results, it is possible to
interpret the experimental evidence for the existence of correlations between mutations at residue 82 and at
residues 10, 54 and 71 as correlations between the main kinetic barrier, TB, and the others ($\beta_1$,
$\beta_2$, $\beta_3$). The large separation of these associated sites, both along the sequence and in space,
suggests that their correlations arise by virtue of the folding process itself. This kinetic effect is
particularly clear in one of these cases, namely the co-mutation of sites in the TB and at residue 10, which
occurs under IND therapy. The mediator of the correlation is residue 23, which takes part in two bottlenecks, TB
and $\beta_1$, through direct contact with residues 84 and 10, respectively. Covarying mutations between the two
sites are observed because changes in TB affect the environment of the other key site, 10, which has to mutate
accordingly.



VI. CONCLUSIONS 



The strategy presented here makes it possible both to identify the bottlenecks of the folding process and to
explain their highly significant match with known mutating residues. This approach should be readily applicable
to identifying the kinetic bottlenecks of other viral enzymes of pharmaceutical interest, thus aiding the
development of novel inhibitors targeting the kinetic bottlenecks. Such inhibitors are expected to make it
dramatically more difficult for the virus to express mutated proteins that still fold efficiently into the same
native state with unaltered functionality.

Acknowledgements This work was supported by INFM and MURST. 



[1] Gulnik S., Erickson J.W., Xie D., Vitam. Horm. 58, 213-256 (2000).
[2] Wlodawer A., Erickson J.W., Annu. Rev. Biochem. 62, 543-585 (1993), and references therein.
[3] Reddy P., Ross J., Formulary 34, 567-675 (1999).
[4] Ala P.J. et al., Biochemistry 37, 15042-15049 (1998).
[5] Condra J.H. et al., Nature 374, 569-571 (1995).
[6] Brown A.J., Korber B.T., Condra J.H., AIDS Res. Hum. Retroviruses 15, 247-253 (1999).
[7] Durant J. et al., Lancet 353, 2195-2199 (1999).
[8] Boucher C., AIDS 10, S15-S19 (1996).
[9] Micheletti C., Banavar J.R., Maritan A., Seno F., Phys. Rev. Lett. 82, 3372-3375 (1999).
[10] Fersht A.R., Proc. Natl. Acad. Sci. USA 92, 10869-10873 (1995).
[11] Maritan A., Micheletti C., Banavar J.R., Phys. Rev. Lett. 84, 3009-3012 (2000).
[12] Galzitskaya O.V., Finkelstein A.V., Proc. Natl. Acad. Sci. USA 96, 11299-11304 (1999).
[13] Muñoz V., Henry E.R., Hofrichter J., Eaton W.A., Proc. Natl. Acad. Sci. USA 95, 5872 (1998).
[14] Alm E., Baker D., Proc. Natl. Acad. Sci. USA 96, 11305-11310 (1999).
[15] Chiti F. et al., Nature Struct. Biol. 6, 1005-1009 (1999).
[16] Martinez J.C., Pisabarro M.T., Serrano L., Nature Struct. Biol. 5, 721-729 (1998).
[17] Clementi C., Nymeyer H., Onuchic J.N., J. Mol. Biol. 298, 937-953 (2000).
[18] Martinez J.C., Serrano L., Nature Struct. Biol. 6, 1010-1016 (1999).
[19] Riddle D.S. et al., Nature Struct. Biol. 6, 1016-1024 (1999).
[20] Plaxco K.W., Simons K.T., Baker D., J. Mol. Biol. 277, 985-994 (1998).
[21] Baker D., Nature 405, 39-42 (2000).
[22] Anfinsen C., Science 181, 223-230 (1973).
[23] Wolynes P.G., Onuchic J.N., Thirumalai D., Science 267, 1619-1620 (1995).
[24] Go N., Scheraga H.A., Macromolecules 9, 535-542 (1976).
[25] Clementi C., Carloni P., Maritan A., Proc. Natl. Acad. Sci. USA 96, 9616-9621 (1999).
[26] Chan H.S., Dill K.A., Proteins: Struct. Funct. Genet. 30, 2-33 (1998).
[27] Evans D.J., Hoover W.G., Failor B.H., Moran B., Ladd A.J.C., Phys. Rev. A 28, 1016-1021 (1983).
[28] Ferrenberg A.M., Swendsen R.H., Phys. Rev. Lett. 63, 1195 (1989).
[29] Hao M.H., Scheraga H.A., Physica A 244, 124-145 (1997).
[30] Molla A. et al., Nat. Med. 2, 760-766 (1996).
[31] Markowitz M. et al., J. Virol. 69, 701-706 (1995).
[32] Patick A.K. et al., Antimicrob. Agents Chemother. 40, 292-297 (1996).
[33] Tisdale M. et al., Antimicrob. Agents Chemother. 39, 1704-1710 (1995).
[34] Jacobsen H. et al., J. Infect. Dis. 173, 1379-1387 (1996).
[35] Chen Z., Li Y., Schock H.B., Hall D., Chen E., Kuo L.C., J. Biol. Chem. 270, 21433-21436 (1995).
[36] Nair A.C., Miertus S., Tossi A., Romeo D., Biochem. Biophys. Res. Commun. 242, 545-551 (1998).






FIG. 1. Structure of HIV-1 PR.

FIG. 2. Fractional occupation for three different native contacts. In all cases the fractional occupation
approaches 1 as $T \to 0$ and vanishes for very large $T$. However, the rapidity of formation (slope at the
inflection point) is very different. The contact between residues 66 and 69, located at the turn of
$\beta$-sheet 1, forms very gradually. The highest rapidity of formation of contacts in $\beta_1$ is observed
for the pair 62, 72 (solid triangles). At the folding temperature, one of the highest formation rapidities is
found for the contact between residues 29 and 86 (open squares). The continuous curves are obtained from a
smooth interpolation of the points.

FIG. 3. Typical dimer conformations near the dissociation temperature $T_{\rm diss}$ (a). Typical monomer
structures at the folding transition, $T_{\rm fold}$ (b) (key residues 29, 32, 76, 86 are highlighted in red) and
at $1.3\,T_{\rm fold}$ (c) (sites 11, 21, 46, 55, 61, 74, responsible for the initiation of the $\beta$-sheets,
are shown in red).

FIG. 4. Behaviour, as a function of temperature, of the average fractional occupation of native contacts within
each HIV-1 PR monomer (solid circles) and at the interface between the two monomeric units (open circles). The
dissociation of the two subunits is clearly seen to occur at $T/T_{\rm fold} = 0.6$.

FIG. 6. (a) Characteristic temperatures $T_0$ and maximum "rapidity" $C_0$ associated with each native contact
versus the position of the amino acids sharing the contact. (b) Distribution of the values of the maximum
rapidity $C_0$ of contacts involving each residue.

FIG. 7. Formation "rapidity" $C_0$ for contacts in the four subregions of Fig. 6a. In order of decreasing
formation temperature, data are shown for (a) $\beta_3$, (b) $\beta_1$, (c) $\beta_2$ and (d) TB. Notice that
the scale of $C_0$ depends on temperature. The highest $C_0$ values for each of the regions increase as the
temperature $T_0$ decreases.



Bottleneck    Key sites
----------    ----------------------
TB            22, 29, 32, 76, 84, 86
$\beta_1$     10, 11, 13, 20, 21, 23
$\beta_2$     44, 45, 46, 55, 56, 57
$\beta_3$     61, 62, 63, 72, 74

TABLE I. Key sites for the four bottlenecks. For each bottleneck, only the sites in the top three pairs of
contacts have been reported.

Name    Point mutations                               Bottlenecks
----    ------------------------------------------    ------------------------------------
RTN     20, 33, 35, 36, 46, 54, 63, 71, 82, 84, 90    TB, $\beta_1$, $\beta_2$, $\beta_3$
NLF     30, 46, 63, 71, 77, 84                        TB, $\beta_2$, $\beta_3$
IND     10, 32, 46, 63, 71, 82, 84                    TB, $\beta_1$, $\beta_2$, $\beta_3$
SQV     10, 46, 48, 63, 71, 82, 84, 90                TB, $\beta_1$, $\beta_2$, $\beta_3$
APR     46, 63, 82, 84                                TB, $\beta_2$, $\beta_3$

TABLE II. Mutations in the protease associated with FDA-approved drug resistance [4]. Sites highlighted in
boldface are those involved in the folding bottlenecks as predicted by our approach. $\beta_i$ refers to the
bottleneck associated with the formation of the $i$-th $\beta$-sheet, whereas TB refers to the bottleneck
occurring at the folding transition temperature $T_{\rm fold}$.